Learning to lemmatise Polish noun phrases

Author

  • Adam Radziszewski
Abstract

We present a novel approach to noun phrase lemmatisation in which the main phase is cast as a tagging problem. The idea draws on the observation that the lemmatisation of almost all Polish noun phrases can be decomposed into transformations of the single words (tokens) that make up each phrase. Our evaluation shows results similar to those obtained earlier by a rule-based system, while our approach allows chunking to be separated from lemmatisation.
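
The decomposition described in the abstract can be illustrated with a minimal sketch. It is not the author's implementation: the tag names ("nom", "keep"), the toy lexicon and the function names are all assumptions made for illustration, and the tags are supplied by hand here, whereas in a tagging approach they would be predicted by a trained sequence model.

```python
from typing import Callable, Dict, List

# Toy lexicon mapping an inflected form to its nominative singular form.
# Hypothetical data for illustration only; a real system would rely on a
# morphological dictionary rather than a hand-written table.
TO_NOMINATIVE: Dict[str, str] = {
    "ministra": "minister",
    "głównego": "główny",
    "urzędu": "urząd",
    "statystycznego": "statystyczny",
}


def keep(token: str) -> str:
    """Leave the token unchanged (e.g. a genitive modifier)."""
    return token


def to_nominative(token: str) -> str:
    """Replace the token with its nominative form if the toy lexicon knows it."""
    return TO_NOMINATIVE.get(token, token)


# Each hypothetical tag names one per-token transformation.
TRANSFORMATIONS: Dict[str, Callable[[str], str]] = {
    "keep": keep,
    "nom": to_nominative,
}


def lemmatise_phrase(tokens: List[str], tags: List[str]) -> str:
    """Apply one transformation per token and join the results into a phrase lemma."""
    if len(tokens) != len(tags):
        raise ValueError("tokens and tags must have the same length")
    return " ".join(TRANSFORMATIONS[tag](tok) for tok, tag in zip(tokens, tags))


if __name__ == "__main__":
    # The head noun is put into the nominative; the genitive modifier stays as is.
    print(lemmatise_phrase(["ministra", "finansów"], ["nom", "keep"]))
    # Every token agrees with the head, so all of them are transformed.
    print(lemmatise_phrase(["głównego", "urzędu", "statystycznego"], ["nom", "nom", "nom"]))
```

The first call shows why a per-token view is useful: lemmatising every word independently would turn "finansów" into its dictionary form, which is not the desired phrase lemma.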

Similar resources

Use of Articles in Learning English as a Foreign Language: A Study of Iranian English Undergraduates

The significance of error analysis for the learner, the teacher and the researcher is now widely recognized. Earlier studies of error analysis concentrated on intersystematic comparison of the “native language” and the “target language” and drew the required data largely from intuitions and impressionistic observations. This study was conducted on the basis of the following observations: (1) to...

Lemmatization of Multi-word Common Noun Phrases and Named Entities in Polish

In the paper we present a tool for lemmatization of multi-word common noun phrases and named entities for Polish, called PoLem. The tool is based on a set of manually crafted rules and heuristics utilizing a set of dictionaries (including morphological, named entities and inflection patterns). The accuracy of lemmatization obtained by the tool reached 97.99% on a dataset with multi-word common ...

The Puzzle of Case Agreement between Numeral Phrases and Predicative Adjectives in Polish

This paper addresses the optionality of case agreement between a numeral phrase in the subject position and its modifying or predicating adjectives in Polish: such adjectives either agree with the numeral or – apparently – reach into the numeral phrase and agree with the noun phrase within it. While previous analyses of this phenomenon postulated special agreement mechanisms, we account for the...

Towards the Lemmatisation of Polish Nominal Syntactic Groups Using a Shallow Grammar

While morphological analysers and taggers usually assign lemmata to wordforms, those tools focus on single words. For some tasks a tool that lemmatises (and thus normalises) whole phrases would be more appropriate. The paper presents, discusses and evaluates a set of tools to lemmatise nominal groups, based on a shallow grammar for Polish. The tools reach an overall success rate of over 58%, an...

Some Aspects of Semantic Representation of Polish Determiners (Wybrane aspekty reprezentacji semantycznej określników języka polskiego)

The paper concerns some methods of semantic analysis of Polish determiners which can be used in Machine Translation. First, a brief summary of traditional approaches to the semantics of Polish noun phrases is presented together with a short discussion. The difference between reference and quantification is argued to be important for proper understanding of some phenomena. Next, a unified model ...

Publication date: 2013